Notes 11/11:
* figure out dcast issues
* replacing NA/0s within pipe

1 Data overview:

  • Fish follows (2 min) conducted at sites in Antigua (6), Barbuda (3) and Bonaire (4) from March-August 2017
  • Follows tracked time spent grazing, bite rates, and competitive interactions (among scarids and with damselfish)
  • Follows targeted Sparisoma viride, Scarus vetula, and Sparisoma aurofrenatum of both initial and terminal phases
  • Sparisoma aurofrenatum were not followed in Bonaire, but were added in Antigua and Barbuda due to low abundances of the other two species. Some sites in Antigua and Barbuda had no/low representation from terminal phase viride and vetula
  • Most eventual analyses will likely focus exclusively on initial phase (standardized size window) S. viride and S. vetula
  • Site-level factors (benthic, fish, and rugosity) were assessed at each of the 13 sites

Need to determine an appropriate size range for comparison. Because fish sizes are not evenly distributed across islands (i.e. larger fish in Bonaire, smaller in Barbuda), I will likely want to compare length-feeding relationships rather than pooled averages.
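One way to compare length-feeding relationships instead of pooled averages is to test whether the slope of feeding rate on length differs among islands. The sketch below uses simulated data: the column names `fr`, `length_cm`, and `island` follow these notes, but the values, the data frame name `follows`, and the linear form are assumptions for illustration only.

```r
# Simulated stand-in data (values are made up; size ranges differ by island
# to mimic larger fish in Bonaire and smaller fish in Barbuda).
set.seed(1)
n <- 90
follows <- data.frame(
  island    = factor(rep(c("Antigua", "Barbuda", "Bonaire"), each = n / 3)),
  length_cm = c(runif(30, 15, 30), runif(30, 10, 25), runif(30, 20, 40))
)
follows$fr <- 1500 - 20 * follows$length_cm + rnorm(n, sd = 100)

# Additive model: one common length slope, island-specific intercepts.
fit_add <- lm(fr ~ length_cm + island, data = follows)

# Interaction model: the length slope is allowed to differ among islands.
fit_int <- lm(fr ~ length_cm * island, data = follows)

# F-test for whether island-specific slopes improve the fit.
slope_test <- anova(fit_add, fit_int)
print(slope_test)
```

If the interaction is not supported, the additive model gives island differences adjusted for length, which sidesteps choosing a single size window.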

2 Examining variable distributions and relationships

2.1 Predictor variables

Potential predictor variables are site-level fish, benthic, and rugosity values. These are likely correlated with one another, and I need to determine which ones I ultimately want to use (if modeling behavioral responses via any multivariate regressions). I can also move to structural equation modeling (SEM) if I want to keep multiple correlated predictors.

First, check distribution of predictor variables of interest: not very normally distributed…

Variable selection notes:
- Excluding both carnivore variables, as they are highly correlated with scarid biomass and total biomass. Eventually I could make these more nuanced by distinguishing actual predators, but right now I don’t think they reflect actual populations of predators of >15 cm parrotfish.
- Rugosity is highly correlated with turf cover and scarid density.
- Scarid density: removing for now, because I think it was a bit skewed by Barbuda juveniles.
- Could eventually use consp. scarid length as another indicator of overfishing?
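A minimal sketch of the screening described above: check normality of each site-level predictor, then flag highly correlated pairs. The table `sites`, its values, the column names `rugosity` and `turf`, and the 0.7 cutoff are all assumptions for illustration; only `scar_bm` and `carn_bm` appear as names elsewhere in these notes.

```r
set.seed(2)
# Hypothetical site-level table: 13 sites, made-up values.
sites <- data.frame(
  scar_bm  = rlnorm(13),
  carn_bm  = rlnorm(13),
  rugosity = runif(13, 1, 2),
  turf     = runif(13, 0, 60)
)

# Shapiro-Wilk p-value per variable (small p suggests non-normality).
shapiro_p <- sapply(sites, function(x) shapiro.test(x)$p.value)
print(round(shapiro_p, 3))

# Spearman correlations are rank-based, so they are less sensitive to
# the skewed distributions noted above.
cor_mat <- cor(sites, method = "spearman")

# List predictor pairs whose |rho| exceeds an arbitrary 0.7 cutoff.
high <- which(abs(cor_mat) > 0.7 & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, 1]],
  var2 = colnames(cor_mat)[high[, 2]],
  rho  = cor_mat[high]
)
print(high_pairs)
```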

## Importance of components:
##                          PC1    PC2     PC3     PC4     PC5     PC6
## Standard deviation     1.694 1.5167 0.66913 0.50517 0.30585 0.18692
## Proportion of Variance 0.478 0.3834 0.07462 0.04253 0.01559 0.00582
## Cumulative Proportion  0.478 0.8614 0.93605 0.97859 0.99418 1.00000
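The proportions above equal each squared standard deviation divided by 6 (for example 1.694^2 / 6 ≈ 0.478), which is what a PCA on six standardized variables would give. A sketch of how such a table could be produced with `prcomp` follows; the input `benthic` and its column names are placeholders with random values, not the real site data.

```r
set.seed(3)
# Placeholder input: 13 sites x 6 variables of random numbers.
benthic <- as.data.frame(
  matrix(rnorm(13 * 6), nrow = 13,
         dimnames = list(NULL, paste0("var", 1:6)))
)

# Centre and scale so every variable contributes unit variance.
pca <- prcomp(benthic, center = TRUE, scale. = TRUE)

# "Importance of components" table, as printed in the notes.
pca_summary <- summary(pca)
print(pca_summary$importance)

# Keep the first two axes as site-level predictors (pc1, pc2).
site_scores <- pca$x[, 1:2]
colnames(site_scores) <- c("pc1", "pc2")
```

With scaled inputs the component variances sum to the number of variables, so the first two axes here (about 86% cumulative in the output above) carry most of the site-level variation.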

2.2 Response variables

Fish-level grazing behaviors (as well as competitive interaction frequency)

2.2.1 Distributions

Variable selection notes: - for_bites is correlated with fr and for_dur, but I will play around with keeping it for now.

3 Grazing boxplots

##             Df   Sum Sq Mean Sq F value   Pr(>F)    
## island       2 19329868 9664934   24.48 5.08e-09 ***
## Residuals   80 31586112  394826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = fr ~ island, data = vet)
## 
## $island
##                       diff        lwr       upr     p adj
## Antigua-Bonaire  -936.8111 -1387.7231 -485.8992 0.0000115
## Barbuda-Bonaire -1058.3667 -1486.4053 -630.3282 0.0000002
## Barbuda-Antigua  -121.5556  -670.7081  427.5968 0.8575559

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2   0.731 0.4846
##       80
##             Df  Sum Sq Mean Sq F value   Pr(>F)    
## island       2 3522230 1761115    26.9 5.52e-10 ***
## Residuals   95 6218718   65460                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = fr ~ island, data = vir, white.adjust = T)
## 
## $island
##                       diff       lwr        upr     p adj
## Antigua-Bonaire -408.89527 -550.1969 -267.59360 0.0000000
## Barbuda-Bonaire  -70.05115 -234.5194   94.41711 0.5698132
## Barbuda-Antigua  338.84411  180.1422  497.54604 0.0000055

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.9125  0.405
##       95

##             Df Sum Sq Mean Sq F value  Pr(>F)    
## island       2  1.582  0.7911   22.98 1.3e-08 ***
## Residuals   80  2.753  0.0344                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = g_frac ~ island, data = vet, white.adjust = T)
## 
## $island
##                        diff        lwr        upr     p adj
## Antigua-Bonaire -0.27575852 -0.4088858 -0.1426312 0.0000122
## Barbuda-Bonaire -0.29697818 -0.4233524 -0.1706040 0.0000008
## Barbuda-Antigua -0.02121967 -0.1833515  0.1409122 0.9476109

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.1837 0.8325
##       80
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## island       2  2.808  1.4040   26.13 9.1e-10 ***
## Residuals   95  5.105  0.0537                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = g_frac ~ island, data = vir, white.adjust = T)
## 
## $island
##                        diff        lwr         upr     p adj
## Antigua-Bonaire -0.36517814 -0.4932046 -0.23715172 0.0000000
## Barbuda-Bonaire -0.06284811 -0.2118646  0.08616841 0.5760464
## Barbuda-Antigua  0.30233003  0.1585381  0.44612197 0.0000076

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.7501 0.4751
##       95

##             Df Sum Sq Mean Sq F value Pr(>F)
## island       2  0.314  0.1571   1.448  0.241
## Residuals   80  8.676  0.1085
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = br ~ island, data = vet)
## 
## $island
##                         diff        lwr        upr     p adj
## Antigua-Bonaire -0.123227748 -0.3595491 0.11309360 0.4303229
## Barbuda-Bonaire -0.132053164 -0.3563867 0.09228034 0.3427971
## Barbuda-Antigua -0.008825416 -0.2966343 0.27898343 0.9970480

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  2  10.408 9.601e-05 ***
##       80                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##             Df Sum Sq Mean Sq F value Pr(>F)
## island       2  0.045 0.02253   0.502  0.607
## Residuals   95  4.267 0.04491
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = br ~ island, data = vir, white.adjust = T)
## 
## $island
##                         diff         lwr       upr     p adj
## Antigua-Bonaire  0.045754540 -0.07128921 0.1627983 0.6222646
## Barbuda-Bonaire  0.043704156 -0.09252906 0.1799374 0.7260125
## Barbuda-Antigua -0.002050384 -0.13350720 0.1294064 0.9992399

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  2   10.63 6.823e-05 ***
##       95                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##             Df Sum Sq Mean Sq F value  Pr(>F)   
## island       2   1457   728.5   7.298 0.00126 **
## Residuals   76   7586    99.8                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 4 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = for_bites ~ island, data = vet)
## 
## $island
##                        diff        lwr       upr     p adj
## Antigua-Bonaire -9.09956631 -16.491319 -1.707813 0.0118676
## Barbuda-Bonaire -9.17905349 -16.570807 -1.787300 0.0110379
## Barbuda-Antigua -0.07948718  -9.447089  9.288114 0.9997732

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  2.1906 0.1189
##       76
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## island       2  903.1   451.5   13.19 9.77e-06 ***
## Residuals   88 3012.5    34.2                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 7 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = for_bites ~ island, data = vir, white.adjust = T)
## 
## $island
##                      diff         lwr         upr     p adj
## Antigua-Bonaire -7.291484 -10.6759972 -3.90697129 0.0000050
## Barbuda-Bonaire -3.692020  -7.4808559  0.09681671 0.0578051
## Barbuda-Antigua  3.599465  -0.1446468  7.34357613 0.0621841

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)   
## group  2  5.0607 0.00831 **
##       88                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
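For reference, a sketch of the workflow behind each block of output above: one-way ANOVA, Tukey post-hoc comparisons, and a median-centred Levene test. The data are simulated (group sizes chosen only to give the same 80 residual degrees of freedom as the `vet` fits), and the Levene test is computed by hand in base R rather than with `car::leveneTest`, which the printed header suggests was used. Note that `white.adjust` is, as far as I know, an argument of `car::Anova()` rather than `aov()`, so it may have had no effect in the fits above. Where Levene's test is significant (br, and for_bites in viride), a Welch test is one possible alternative, shown last.

```r
set.seed(4)
# Simulated stand-in for the vetula data (values are made up).
vet <- data.frame(
  island = factor(rep(c("Antigua", "Barbuda", "Bonaire"),
                      times = c(20, 25, 38))),
  fr     = c(rnorm(20, 600, 500), rnorm(25, 500, 500), rnorm(38, 1500, 700))
)

# One-way ANOVA and Tukey HSD pairwise comparisons.
fit <- aov(fr ~ island, data = vet)
print(summary(fit))
tukey <- TukeyHSD(fit, "island")
print(tukey)

# Median-centred Levene test by hand: a one-way ANOVA on the absolute
# deviations of each observation from its group median.
abs_dev <- abs(vet$fr - ave(vet$fr, vet$island, FUN = median))
levene <- anova(lm(abs_dev ~ vet$island))
print(levene)

# Welch one-way test, which does not assume equal variances.
welch <- oneway.test(fr ~ island, data = vet, var.equal = FALSE)
print(welch)
```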

4 Exploratory bivariate plots

4.1 Grazing as a function of fish length

Note: remove G1 grazing instances here?

4.2 Grazing as a function of site traits

Notes:
- Scarid biomass is not the best predictor once I account for differences among my samples in the sizes of fish I was sampling. I think the grazing/length relationships are much stronger.
- Restricting the sample to length windows lowers sample size and makes trends much less pronounced, esp. for phase differences.
- Reducing the sample to a single phase also blurs trends.

4.3 Competitive interactions

5 Model trials

  • site-level predictors: scar_bm, scar_den, carn_bm, benthic (pc1, pc2)
  • fish-level predictors: species, phase, length
  • eventually run separately for different species
  • species*scar_bm interaction?
  • random effects: island
  • response variables: g_frac, br (?), and fr (run separately)

5.1 Mixed effects models

  • random effect: island
## Linear mixed-effects model fit by REML
##  Data: filter(sum_id_pca1, species_code != "rbp") 
##        AIC      BIC    logLik
##   9322.943 9358.316 -4653.471
## 
## Random effects:
##  Formula: ~1 | island
##         (Intercept) Residual
## StdDev:    442.1831 450.4068
## 
## Fixed effects: fr ~ phase + length_cm + species + pc1 + pc2 
##                             Value Std.Error  DF    t-value p-value
## (Intercept)             1326.6141 270.99370 613   4.895369  0.0000
## phaset                  -163.0290  50.24074 613  -3.244957  0.0012
## length_cm                 -9.1235   3.52618 613  -2.587367  0.0099
## speciesSparisoma viride -623.1907  37.04353 613 -16.823201  0.0000
## pc1                       89.0698  44.32152 613   2.009628  0.0449
## pc2                        8.7783  31.20623 613   0.281299  0.7786
##  Correlation: 
##                         (Intr) phaset lngth_ spcsSv pc1   
## phaset                   0.168                            
## length_cm               -0.307 -0.651                     
## speciesSparisoma viride -0.105 -0.067  0.085              
## pc1                      0.037  0.050  0.021  0.008       
## pc2                      0.046  0.007 -0.012  0.003  0.114
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.89990834 -0.56004163 -0.01254647  0.53166747  5.13083127 
## 
## Number of Observations: 621
## Number of Groups: 3
##     phase length_cm   species       pc1       pc2 
##  1.749662  1.751095  1.007674  1.021045  1.013411
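The output format above matches `nlme::lme`. A sketch of a call with the same fixed-effect formula and a random intercept for island follows. All data are simulated, the data frame name `fish` is a placeholder, and the initial-phase level label `"i"` is an assumption (only the `"t"` level is visible in the output).

```r
library(nlme)

set.seed(5)
n <- 300
# Simulated stand-in data; every value here is made up.
fish <- data.frame(
  island    = factor(sample(c("Antigua", "Barbuda", "Bonaire"), n, replace = TRUE)),
  species   = factor(sample(c("Scarus vetula", "Sparisoma viride"), n, replace = TRUE)),
  phase     = factor(sample(c("i", "t"), n, replace = TRUE)),
  length_cm = runif(n, 10, 40),
  pc1       = rnorm(n),
  pc2       = rnorm(n)
)
island_shift <- c(Antigua = -300, Barbuda = -400, Bonaire = 500)
fish$fr <- 1300 +
  unname(island_shift[as.character(fish$island)]) -
  9 * fish$length_cm -
  600 * (fish$species == "Sparisoma viride") +
  rnorm(n, sd = 450)

# Random intercept for island; fixed effects as in the printed formula.
fit_lme <- lme(fr ~ phase + length_cm + species + pc1 + pc2,
               random = ~ 1 | island,
               data = fish, method = "REML")

# Fixed-effect table (Value, Std.Error, DF, t-value, p-value).
print(summary(fit_lme)$tTable)
```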

5.2 GAM + GAMM

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ species + phase + s(length_cm) + s(scar_bm) + s(pc1) + s(pc2)
## 
## Parametric coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    1088.72      27.46  39.652  < 2e-16 ***
## speciesSparisoma aurofrenatum  -508.69      50.10 -10.153  < 2e-16 ***
## speciesSparisoma viride        -618.17      33.37 -18.526  < 2e-16 ***
## phaset                         -111.30      39.36  -2.828  0.00481 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F  p-value    
## s(length_cm) 1.000  1.000 16.207 6.22e-05 ***
## s(scar_bm)   1.006  1.009  2.544  0.11121    
## s(pc1)       3.481  4.039 33.161  < 2e-16 ***
## s(pc2)       3.814  4.375  3.524  0.00476 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.513   Deviance explained = 52.1%
## -REML = 5772.3  Scale est. = 1.627e+05  n = 782
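The summary above has the layout of an `mgcv::gam` fit. A sketch of a call with the same formula structure follows, again on simulated data. The basis dimension `k = 5` on the site-level smooths is my assumption, added because 13 sites give at most 13 unique covariate values; the original fit may have used the defaults.

```r
library(mgcv)

set.seed(6)
n <- 400
# Simulated site-level values for 13 sites, then assigned to fish.
site      <- sample(1:13, n, replace = TRUE)
site_vals <- data.frame(scar_bm = rlnorm(13), pc1 = rnorm(13), pc2 = rnorm(13))

fish <- data.frame(
  species   = factor(sample(c("Scarus vetula", "Sparisoma aurofrenatum",
                              "Sparisoma viride"), n, replace = TRUE)),
  phase     = factor(sample(c("i", "t"), n, replace = TRUE)),
  length_cm = runif(n, 10, 40)
)
fish$scar_bm <- site_vals$scar_bm[site]
fish$pc1     <- site_vals$pc1[site]
fish$pc2     <- site_vals$pc2[site]

# Made-up response with a nonlinear pc1 effect.
fish$fr <- 1100 -
  600 * (fish$species == "Sparisoma viride") -
  500 * (fish$species == "Sparisoma aurofrenatum") -
  10 * (fish$length_cm - 25) +
  150 * sin(fish$pc1) +
  rnorm(n, sd = 400)

# Parametric terms for species and phase, smooths for the rest.
fit_gam <- gam(fr ~ species + phase + s(length_cm) +
                 s(scar_bm, k = 5) + s(pc1, k = 5) + s(pc2, k = 5),
               data = fish, method = "REML")
gam_summary <- summary(fit_gam)
print(gam_summary$s.table)
```

An edf near 1, as for `s(length_cm)` and `s(scar_bm)` in the output above, means the smooth was penalized to an essentially linear term.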


To Do as of Nov. 7
* Boosted regression trees (see Adrian's Ecosphere 2017 paper)
* Species as random effect